Speeding Up OpenMP Tasking
نویسندگان
چکیده
In this work we present a highly efficient implementation of OpenMP tasks. It is based on a runtime infrastructure architected for data locality, a crucial prerequisite for exploiting the NUMA nature of modern multicore multiprocessors. In addition, we employ fast work-stealing structures, based on a novel, efficient and fair blocking algorithm. Synthetic benchmarks show up to a 6-fold increase in throughput (tasks completed per second), while for a task-based OpenMP application suite we measured up to 87% reduction in execution times, as compared to other OpenMP implementations.
منابع مشابه
Evaluating OpenMP Tasking at Scale for the Computation of Graph Hyperbolicity
We describe using OpenMP to compute δ-hyperbolicity, a quantity of interest in social and information network analysis, at a scale that uses up to 1000 threads. By considering both OpenMP workshare and tasking models to parallelize the computations, we find that multiple task levels permits finer grained tasks at runtime and results in better performance at scale than worksharing constructs. We...
متن کاملApproaches for Task Affinity in OpenMP
OpenMP tasking supports parallelization of irregular algorithms. Recent OpenMP specifications extended tasking to increase functionality and to support optimizations, for instance with the taskloop construct. However, task scheduling remains opaque, which leads to inconsistent performance on NUMA architectures. We assess design issues for task affinity and explore several approaches to enable i...
متن کاملPerformance Profiling for OpenMP Tasks
Tasking in OpenMP 3.0 allows irregular parallelism to be expressed much more easily and it is expected to be a major step towards the widespread adoption of OpenMP for multicore programming. We discuss the issues encountered in providing monitoring support for tasking in an existing OpenMP profiling tool with respect to instrumentation, measurement, and result presentation.
متن کاملOpenMP Tasking Model for Ada: Safety and Correctness
The safety-critical real-time embedded domain increasingly demands the use of parallel architectures to fulfill performance requirements. Such architectures require the use of parallel programming models to exploit the underlying parallelism. This paper evaluates the applicability of using OpenMP, a widespread parallel programming model, with Ada, a language widely used in the safety-critical d...
متن کاملOpenMP 3.0 Tasking Implementation in OpenUH∗
As multicore technology dominates the processor market, new methodologies are being explored to exploit the parallelism inherent to these architectures and shared memory programming models are gaining in popularity. The ratification of the OpenMP 3.0 API has provided compiler developers with another challenge as the multicore revolution reshapes the landscape in scientific computing. The introd...
متن کامل